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To address an important risk classification issue that arises in 
clinical practice, we propose a new mixture model via latent cure rate 
markers for survival data with a cure fraction. In the proposed model, 
the latent cure rate markers are modeled via a multinomial logistic 
regression and patients who share the same cure rate are classified 
into the same risk group. Compared to available cure rate models, 
the proposed model fits better to data from a prostate cancer clini- 
cal trial. In addition, the proposed model can be used to determine 
the number of risk groups and to develop a predictive classification 
algorithm. 

1. Introduction. In cure rate modeling of event-time data, a fraction of 
tlie population is considered to iiave zero hazard. The model is often suit- 
able for survival data from cancer clinical trials owing to advances in medical 
treatment and health care. For example, treatment of prostate cancer rou- 
tinely cures the patient in the sense of completely eradicating the disease. 
Existing cure rate models are able to accommodate a fraction of the popula- 
tion being cured [Berkson and Gage (1952) and Mailer and Zhou (1996)] and 
characteristics of tumor growth [Chen, Ibrahim and Sinha (1999), Tsodikov, 
Ibrahim and Yakovlev (2003) and Cooner et al. (2007)]. However, risk-group 
information is not easily incorporated. Prostate cancer patients can be classi- 
fied into low, intermediate and high-risk groups on the basis of pre-treatment 
characteristics, such as the level of prostate-specific antigen (PSA), biopsy 
Gleason scores or clinical tumor categories [D'Amico et al. (1998, 2002)]. A 
failure to incorporate risk stratification into the cure rate model can lead to 
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poorly fitting statistical models and poorly estimated cure rates and predic- 
tive probabilities of risk groups. We address this problem via a latent class 
analysis of the cure rate model. 

We consider data from a retrospective cohort study of n = 1235 men 
treated with radical prostatectomy (RP) at Brigham and Women's Hospi- 
tal between 1988-2001, which is a subset of the data published in D'Amico 
et al. (2002). The primary endpoint is the time to prostate-specific anti- 
gen (PSA) recurrence or to the last follow-up, whichever came first. There 
were 261 patients who had PSA recurrence after the radical prostatectomy. 
We consider four prognostic factors: natural logarithm of prostate specific 
antigen (LogPSA) prior to RP, biopsy Gleason score, the 1992 American 
Joint Commission on Cancer (AJCC) clinical tumor category and the year 
of radical prostatectomy (Year). D'Amico et al. (2002) considered three risk 
groups based on PSA, biopsy Gleason score and clinical tumor category and 
reported the estimates of 8-year PSA recurrence free survival for the three 
risk groups based on the Kaplan-Mier (KM) method [Kaplan and Meier 
(1958)]. For the subset of the data considered here, the KM estimates of 
8-year PSA recurrence free survival are 88.7%, 57.4% and 23.4% for low-risk 
patients (Tic, T2a, a PSA level < 10 ng/mL), intermediate-risk patients 
(T2b or Gleason score 7 or a PSA level > 10 and < 20 ng/mL) and high-risk 
patients (T2c or PSA level > 20 ng/mL or Gleason score > 8), respectively. 
This risk classification does capture that the low-risk patients have the high- 
est PSA recurrence free survival and the high-risk patients have the lowest 
PSA recurrence free survival. However, there are some limitations of this risk 
classification. First, this risk classification is deterministic. In other words, it 
is not associated with predictive probabilities of risk groups. Second, it may 
be problematic, especially for those patients whose clinical characteristics 
fall within the boundary between two risk groups. Third, this risk classifi- 
cation is not fiexible enough to incorporate other potentially important risk 
factors. For example, the year of diagnosis or treatment may have a signifi- 
cant effect on the "cure rate" and, thus, it may be an important factor for 
risk classification. To overcome these limitations, we develop a predictive 
classification algorithm based on the latent cure rate marker model. This al- 
gorithm first computes the probabilities of risk groups for a patient based on 
his clinical characteristics and then classifies him to a particular risk group 
with the largest predictive probability. As shown in Table 4 in Section 5, for 
a patient who had PSA of 5, and Gleason score < 6, clinical stage Tl and 
surgery in 2001, the predictive probabilities for three risk groups are 0.745, 
0.241 and 0.014 and, thus, this patient will be classified into the "low risk" 
group. 

Overdiagnosis of clinically insignificant prostate cancer was considered a 
major issue of prostate-specific antigen (PSA) screening since the U.S. Food 
and Drug Administration approved PSA testing in 1986 as a way to monitor 
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prostate cancer progression [Wang and Arnold (2002)]. Etzioni et al. (2002) 
estimated rates of prostate cancer overdiagnosis due to PSA testing among 
men who were 60 to 84 years old in 1988. Overdiagnosis may occur when 
older men or men with comorbid illness who have very low risk disease are 
treated. However, overdiagnosis is usually not the case for men treated with 
surgery because they are healthy but they can have very low risk disease. 
Since the data we analyze in this paper were from those men who went 
to surgery, it may be appropriate to fit a cure rate model to this particular 
prostate cancer data set. We include the year of RP in the analysis, as it may 
have a significant effect on the "cure rate." There are two reasons for this. 
With time people are diagnosed after several PSA tests and serial screened 
men are diagnosed earlier with more favorable disease [e.g., Martin et al. 
(2008)] and with increased medical experience. Over time, the techniques of 
treatment also improve, which can improve outcome due to a learning curve, 
especially when new surgery (e.g., robotic RP) or radiation therapy (e.g., 
seed therapy) techniques are used. We fit both the Cox proportional hazards 
regression model [Cox (1972)] and the proposed latent cure rate marker 
model with a piecewise exponential baseline hazard function to this prostate 
cancer data. We then computed the logarithm of pseudomarginal likelihood 
(LPML) [Ibrahim, Chen and Sinha (2001), Chapter 6] and the Deviance 
Information Criterion (DIG) proposed by Spiegelhalter et al. (2002) for each 
model. From Table 2 in Section 5, we see that the best LPML and DIC 
values are —821.5 and 1640.8 for the Cox model and —816.0 and 1613.7 for 
the latent cure rate marker model. These results indicate that the cure rate 
model fit the data much better than the noncure rate model. Thus, a cure 
rate model is indeed needed for this data set. 

Section 2 provides the detailed development of the proposed latent cure 
rate model. The prior and posterior are discussed in Section 3. The pos- 
terior predictive classification algorithm is developed in Section 4. Section 
5 presents an analysis of the prostate cancer data. We conclude the paper 
with brief discussions in Section 6. 

2. The models. 

2.1. Preliminary. Let yi denote the observed survival time and let i/j 
be the censoring indicator that equals 1 if yi is a failure time and if it 
is right censored for the ith. subject. Also, let Ni denote the number of 
metastatic-competent tumor cells and assume that the iVj's are independent 
Poisson random variables with mean 6i. Suppose further that Wij denotes 
the random time for the jth carcinogenic cell to produce a detectable can- 
cer mass (incubation time for the jth carcinogenic cell) for the ith subject. 
We assume that the variables Wij, i = 1,2, . . . , are independent and dis- 
tributed with a common distribution function F{y), and are independent 
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of Ni. The time to relapse of cancer can be defined by the random vari- 
able Yi = mm{Wij,0 < j < N^}, where P{WiQ = oo) = 1. Then, the survival 
function for the cure rate model for the ith subject is given by 

(1) S,{y) = P{Y, >y)= exp{-eiF{y)}. 

Using (1), the cure rate is given by 5j(oo) =exp(— 0,), which is also equal 
to P{Ni = 0). Other properties of the cure rate model (1) can be found in 
Yakovlev et al. (1993), Yakovlev and Tsodikov (1996) and Chen, Ibrahim and 
Sinha (1999). To build a regression model, Chen, Ibrahim and Sinha (1999) 
introduced covariates through 6i via 9i = 9{x.^j3) = exp(x^/3), where Xj = 
{xii,Xi2, . ■ . iXip)' denotes the p x 1 vector of covariates for the ith subject 
and (3 = (/3i , , • • • , /9p)' is the corresponding vector of regression coefficients, 
i = 1,2, . . . ,n. Let S{y) = 1 - F{y) and f{y) = jj^F{y). Then the resulting 
survival function is given by 

(2) 5i(y|xi,/3) = exp{-exp(x^/3)F(y)}. 

We refer to (2) as the CIS model. 

A natural extension of the CIS model is the cure rate double regression 
model. Let /3i = (/3ii,/3i2, . • ■ ,/3ip)' and = {hi, hi-, ■ ■ ■■.hp)' ■ We assume 
9i = exp(x^/32). A proportional hazards model is assumed for the distribution 
function F{y) of the incubation time for noncured subjects. Specifically, 
let the cumulative hazard function H{y) = exp{x.^(32)HQ{y), where Ho{y) is 
the baseline cumulative hazard function. Then, F(y) = 1 — exp{—H{y)} = 
1 — exp{— exp(x^/32)-f^o(y)}- Under this assumption for F{y), we have 

(3) 5i(y|xi,/3) =exp(-exp(x-/3i)[l -exp{-exp(x-/32)i?o(y)}]), 

where (3= {j3'i^ fi'2)' ■ Yakovlev and Tsodikov (1996) used parametric accel- 
erated failure time effects on the cumulative hazards with a similar idea. 
Model (3) in its semiparametric form has appeared in Broet et al. (2001), 
where they tended to use a generalized Gompertz name for the model. 
We see, from (3), that the model in (3) incorporates the covariates into 
both the cure rate and the hazard function with double proportional haz- 
ards structures. Thus, we refer to this model as the PHPH model. The 
name "PHPH" was also introduced by Tsodikov (2002) for the semipara- 
metric version of the model. Recently, Liu, Lu and Shao (2006) developed 
the PHPH version of the standard cure rate model of Berkson and Gage 
(1952). 

Another extension of the CIS model is the latent activation cure rate 
(LACK) model proposed by Cooner et al. (2007). Given iVj > 1, let IVj(i) < 
M^i{2) < ••• < ^i(7Vi) denote the ordered values of the Wjj's. The time to 
relapse of cancer is defined by Yi = for I < Ri < Ni and Wio if iVj = 

0, where Ri is an integer valued variable. Cooner et al. (2007) specified 
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a conditional distribution for Ri given Ni, denoted by [i?j|A^j]. When Ni 
follows a Poisson distribution with mean 6i = exp(xj/3) and [i?i|A^j] is a 
discrete uniform on {1, 2, . . . , A^j} with probability the survival function 
under the LACR model is given by 

(4) S'i(y|xi,/3) = exp{- exp(xi/3)} + [1 - exp{- exp(xi/3)}]S'(?/). 

Other distributions for Ni and [iJjiA'j] are also considered in Cooner et al. 
(2007). 

2.2. A new latent cure rate marker model. The latent cure rate marker 
(LCRM) model assumes that the A'j's are independent Poisson random vari- 
ables with mean 9g^, where gt is a (unknown) group indicator, and exp{—9g.) 
is the cure rate marker. Let G denote the number of distinct values of 6g. . 
Further, gi {1 < gi < G) indicates the group membership. Without loss of 
generality, we assume 9i < 62 < • ■ ■ < 6g- Under these constraints, the group 
membership g^ is uniquely defined. Similar to the PHPH model, we assume 
the proportional hazards model for cumulative hazard function H{y), that 
is, H{y) = exp(x^/3)i/o(2/), where HQ{y) is the baseline cumulative hazard 
function. Then, under the LCRM model, the conditional survival function 
of yi given 9g. is of the form 

(5) 5,(?/|xi,/3,^3J = exp(-03Jl-exp{-exp(x^/3)i7o(?/)}]). 

We assume a multinomial logistic regression model for the latent group 
membership g^. To this end, let = (zjO; Zii,Zi2, ■ ■ ■ , Ziq) denote a ((7 + 1) x 1 
vector of covariates for the ith subject, which includes an intercept (i.e., 
Zio = 1) for i = 1, 2, . . . , n. Also let 4>j = {4>jO-,'t'ji-,4>j2-, ■ ■ ■ , 4>jq)' denote the 
corresponding vectors of regression coefficients for j = 1, 2, . . . , G and 0' = 
{(j)'i,4)'2, . . . , (jjQ^i). Then the density of the group membership gi is given by 

Ta=i exp(z^0z) 

For notational convenience, we let cf)Q = (0, 0, . . . , 0)'. Write 6' = (^1, ^2, • • • , do)- 
Using (5), the unconditional survival function of yi is given by 

5'i(2/|xi,Zi,/3,0,0) 

-^exp(-0fe[l-exp{-exp(x:/3)//o(y)}]): "''P^^^'^^^ 



k=i Ei=iexp(z>J 

Unlike the CIS model, the LCRM model does not directly link the co- 
variates to the cure fractions and instead it assumes that the population 
is characterized by an unobserved cure rate marker, namely, exp(— 0^.), 
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where the latent group membership gi is described according to covari- 
ates via a multinomial logistic regression model. We note that the mono- 
tonic constraints on the cure rates ^^'s not only define the group mem- 
bership gi but also ensure identifiability of the multinomial logistic regres- 
sion model. We also see, from (7), that the LCRM model is indeed a fi- 
nite mixture of cure rate models, 6^ ^ 9 for k = 1,2, . . . ,G, (7) reduces 
to 

5i(2/|xi,/3,6l) = exp(-6'[l - exp{-exp(x./3)i/o(y)}]), 

which is a special case of the PHPH model. 

We assume the piecewise exponential model for the baseline hazard func- 
tion hod/), which is constructed as follows. We first partition the time axis 
into J intervals: (soj-si], (si, •S2], • • • , (sj-i, sj], where sq = < si < S2 < 
■■■< sj = 00. We then assume a constant hazard Xj over the jth inter- 
val Ij = (sj-i, Sj]. That is, /io(y) = Aj if y £ Ij for j = 1, 2, . . . , J. Then the 
corresponding cumulative distribution function (c.d.f.), i<o(y|A), is given by 



(8) 



for s,- 



-^o(y|A) = 1 - exp<^ -Aj(y - Sj_i) - ^ Xg{sg - 



9=1 



<y < Sj, where A = (Ai, . . . , A j)' . We note that when J = 1, Fo(y|A) 
reduces to the parametric exponential model. 

Let D = (n, y, X, Z, u, N, g) denote the complete data, where y = (yi, . . . , yn)' 
V = {y\, 1^2, ... , t'n)') ^ is the n x p matrix of covariates with ith row x^, Z, 
which may share common components with X, is a g- vector of covariates 
with ith row z'-, N' = (iVi,iV2,- • ■,Nn), and g' = (51,52,- • ■,9n)- Then, the 
complete data likelihood under the LCRM model is given by 

L{(3,0,(I),X\D) 



n 



\Ui5i. 



(9) 



X exp< Vi6ij:x!i(3 - ex.p{-x'iP)Ni6i. 



X exp 



i=l 



i-1 



Xjiyi - Sj-i) + ^ Xk{sk - Sk-i) 



k=l 



N, logOg^ - log(A^!) - + z>^, - log 



G 



^exp(z>. 



.1=1 



where 6ij = 1 if the ith subject failed or was censored in the jth interval 
Ij, and otherwise. Since N and g are not observed, by summing (9) over 
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N and g, we obtain the likelihood function based on the observed data 
Dobs = {n, y , X, Z, u) given by 



L{P,9,ct),X\Do 



bs) 




J 

log Ok + + ^ 6ij log Xj - exp(x^/3)ifo* ivi) 



- 9k{l - exp{-exp(x',/3)i?o*(yi)}) 
X exp|z-(/)fc -log^^exp(z-(^j)^ 



where H^{yi) = E/=i Sij[Xj{yi - sj-i) + ELi -^K^z - Si-i)]- 

3. The prior and posterior distributions under the LCRM model. We 

consider a joint prior distribution for {(3,6,cj),X). Suppose that J and sj, 
j = 1, . . . , J , are fixed. First we consider a fixed G. We assume that (3, 6, 
(f) and A are independent a priori. Thus, the joint prior for {(3, 6, cj), A) is of 
the form ■K{P,6,(p,\) = 7r(/3)7r(0)7r(0)7r(A). We further assume that 

(11) /3~Afp(0,coi/p), 0fc~Af,(O,co2/g), k = l,2,...,G-l, 

J 

(12) TriX)^l[Xf-^expi-boX,), 
and 

G 

(13) 7r(6l)a nC"'exp(-6fc0fc), < Oi < 02 < ■ ■ ■ < 9g, 

k=l 

where cqi, co2, clq, bo, and 6^, A; = 1, 2, . . . , G, are the prespecified hy- 
perparameters. Due to the monotonic constraints, 9i < 62 < ■ ■ ■ < Ogi elic- 
iting the hyper-parameters and bk becomes more crucial than other hy- 
perparameters. To this end, we first specify Oq = {9oi,9o2, ■ ■ ■ j^ogY such 
that 001 < 9o2 < ■ ■ ■ < 9oG ■ Equivalently, we specify a set of prior cure rates 
exp(— ^OA;)) k = 1,2, . . . ,G. Then, we set = ^ and bk = -rh — > where cq is a 
known constant. This essentially implies that we specify the prior mean and 
the prior standard deviation of 9^ to be ^ofc and co^ofc- Thus, cq quantifies 
the prior uncertainty in ^ofc- A large value of cq reflects a vague prior belief 
in 6*0^ and a small value of cq yields a strong prior belief in 6*0^ • In Section 
5 we use the LPML and DIG measures to guide the choice of cq and and 
we then conduct a sensitivity analysis on various choices of cq and Q^. 
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Based on the prior distributions specified above, the joint posterior dis- 
tribution of /3, 0, (/), A, N and g based on the complete data D is thus given 
by 

(14) 7r(/3, 6>, 0, A, N, g\Dobs) oc L{(3, 0, 0, A|L>)7r(/3)7r(6>)7r((/))7r(A), 

where L{P,9,(I),X\D) is defined in (9). We note that when the priors Tr{(3), 
7r{6), vr(0) and 7r(A) are proper, the resulting posterior is proper. However, 
even when we take an improper prior for 6, an improper uniform prior for (3 
and an improper Jeffreys-type prior for A, that is, cqi oo and oq = 6o = 
in (12), the posterior is still proper under some mild conditions. We formally 
state this result in the following theorem. 

Theorem 1. Suppose that 7r(/3) oc 1 and 7r(A) oc l\j=iX~^. Let Xj be 
an n X (p + 1) matrix with its ith row equal to iy'i5ij{l,x^), where p is the 
dimension of f3. Assume that (i) when Vi = 1, yi> 0, (ii) dj = J27=i ^i^ij ^ 1 
for j = 1, . . . , J , (iii) there exists a j* such that Xj* is of full rank, and (iv) 
co2 > 0, Cfc > and fofc > for k = l,2,. . . ,G — 1, d+Y^'^'l a^ + aG > 0, where 
d = Yli'j=idj, and be >0. Then, the resulting posterior in (14) is proper. 

The proof of the theorem is given in the Appendix. The conditions (i)- 
(iii) are indeed quite mild and essentially require that all event times are 
strictly positive, at least one event occurs in each chosen interval (sj_i,Sj], 
and the covariate matrix is of full rank for at least one interval. These con- 
ditions are easily satisfied in most applications and are quite easy-to-check. 
We note that the condition (iv) does not require ac > 0. Thus, Tr{0) can 
be improper. We also note that the latent structure of the LCRM model 
leads to the development of a Markov chain Monte Carlo (MCMC) al- 
gorithm for sampling from posterior distribution in (14). When G is not 
specified, we assume a truncated Poisson distribution with mean fic on 
{1, 2, . . . , Gmax} for G, where fiQ and Gmax are prespecified. Then, we de- 
velop a reversible jump algorithm for carrying out posterior computation. 
The description of the MCMC algorithm for a fixed G and the detailed de- 
velopment of the reversible jump MCMC based on Lopes and West (2004) 
and Green (1995) are given in online supplementary material [Kim, Xi and 
Chen (2009)]. 

4. Posterior predictive classification under the LCRM model. In this 
section we consider classification via the posterior predictive probability. 
The latent cure rate markers under the LCRM model can be naturally used 
for the predictive classification. Let Xnew and z„eu) denote the future values 
of the vectors of baseline covariates. Also let gnew denote the future group 
indicator. Suppose that gnew takes a value between 1 and G, where G is 



(15) P{9new = k\cly,7.new,G) = _l ^ , fc = l,2,...,G. 
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fixed. Then, the conditional posterior probability for gnew given (p and 
is given by 

exp(z;e^0fc) 
E/=iexp(z;g^0J 

The posterior estimate of this predictive probability for gnew is the posterior 
expectation of P{gnew = k\cf),Znew) given by 

(16) p{k\Znew,G) =E[P{gnew = k\4>, Znew , G)\D obs] 

for k = 1,2, . . . ,G, where the expectation is taken with respect to the pos- 
terior distribution of 4> based on the observed data -Dofts- The clinical inter- 
pretation of (16) is that, given the patient's characteristic z„et«, p{k\znew) is 
the probability that the patient is in the kth risk group. 

Next, we consider the conditional predictive probability for gnew = k given 
the survival time Y >t, Xnew and Znew- This conditional predictive proba- 
bility can be calculated as follows: 

Pi,9new — k\f3, , (f). A, t, 'X-new i ^new i Cr) 

(17) 

^ ewi^'new^f^k - ^k[l - exp{-exp(x;^^/3)ifg(t)}]) 
Ez=i exp(z;g^0, -Oi[l- exp{- exp(x'„g„/3)i?*(i)}]) ' 
where H^{t) is given in (10). The posterior estimate of (17) for gnew is 

(18) p{k\t, Xnew 7 '^new ■, G) = E[P[gnew — k\(3, , cf), \,t, Xnew i '^new)\D obs]- 

From (16) and (18), it is easy to see that 

P(,k\Znew ) G) = p(^k\t = 0, Xnew ; '^new j G) 

for k = 1,2, . . . ,G. Since lim(_^oo -f^o (*) ~ ^1^° have 

exp(z;g^0fc - 9k) 



lim p{k\t, Xnew, Znew, G) = E 

E— >CXD 



Ez=iexp(z'„g„0;-6li) 



(19) 

k = l,2,...,G. 

Using the posterior predictive probability in (18), we classify a new patient 
with characteristic {xnew,'Znew) into risk group k* if 

(20) /E* = argmax p{k\t,Xnew,'Z,new,G). 

l<k<G 

An attractive property of the posterior predictive probability in (18) is 
presented in the following theorem. 

Theorem 2. The posterior predictive probability for the lowest risk group 
(k = l),pil\t 

, ^new ) ^new , 

G), increases in t, while for the highest risk group 

(k = G), p[G\t 

,^newj 'Znew,G) decreases in t. 
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The proof of Theorem 2 is given in the Appendix. Based on (19) and 
Theorem 2, we have that, for t>0, 



(21) P{9new 1 1^) ^raeui ) ^raeui ) G^) — 



exp(z'„e„0i -0i) 



Efc=iexp(z'„e^^fc -6*^) 



which is the largest probabihty that P{gnew = ll^, ^new,'^new,G) may achieve 
for t> given the patient's characteristic {^new,'2'new)- Similarly, we have 



(22) P^Snew — Crji, Xjjg^ , Zjjg^ , G) ^ 



Efc=iexp(z'„g„0fc-6'fc) 



which is the smallest probability that P{gnew = G\t,Xnew,Znew,G) can get 
for t>0. The quantity P{gnew = k\t,:ii.new,'^new,G) is clinically important 
as this gives the patient an idea how well he can do prospectively given his 
baseline characteristic. 

Finally, we note that when G is not specified, a similar posterior predic- 
tive classification algorithm can be established. For example, instead of (15), 
we compute 

P{9new = k\(f),Znew) = 7r(G) q — — — , A; = 1, 2, . . . , Gmax; 

G=i E«=iexp(z'„g„0;) 

where vr(G) denotes the prior distribution for G and Gmax is the largest 
value of G. 



5. Analysis of the prostate cancer data. We revisit the prostate cancer 
data discussed in Section 1. The response variable y is the time to prostate- 
specific antigen (PSA) recurrence. Covariates xi, X2, X3, X4 and X5 corre- 
spond to LogPSA, biopsy Gleason score, clinical tumor category and the 
year of radical prostatectomy (Year). A summary of covariates is given in 
Table 1. The covariates LogPSA (xi) and Year (xs) are continuous, while 
X2, X3 and X4 are binary. The mean and the standard deviation for LogPSA 
were 1.95 and 0.72. We also set Zj = xj for j = 1, . . . , 5. In all computations 
we standardized all covariates by subtracting their respective sample means 
and then being divided by their respective sample standard deviations. 

The hyper-parameters of the prior distribution in Section 4 are specified as 
follows. In (11), (12) and (13), we take cqi = 1000, C02 = 3, oq = 1, 60 = 0.01 
and Co = 2.5. We choose cqi to be much larger than C02 as the posterior 
is proper even when tt{P) oc 1 according to Theorem 1. Also, oq = 1 and 
60 = 0.01 are specified so that the prior for A is relatively noninformative. 
We further specify ^01 = — log(0.5) for G = 1; Oqi = — log(0.9) and 6*02 = 
-log(0.3) for G = 2- Oqi = -log(0.9), 602 = -log(0.5) and ^os = -log(O.l) 
for G = 3; ^oi = -log(0.9), 602 = -log(0.6), ^os = -log(0.3) and % = 
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-log(O.l) for G = 4; and ^oi = - log(0.9), ^02 = - log(0.7), 0o3 = - log(0.5), 
6/04 = -log(0.3) and 6^05 = -log(O.l) for G = 5. We note that (0.9,0.5,0.1) 
for G = 3 were determined by the KM estimates of cure rates based on the 
three risk groups defined in D'Amico et al. (1998, 2002). 

Table 2 shows the values of LPML and DIG for the Cox, CIS, PHPH, 
LACR and LCRM models for various J's and G's. We note that under the 
Cox model, the survival function is given by 5i(y|xj,/3, A) = 
exp{— exp(x-/3)ifo(y|'^)}) where Ho{y\X) is the cumulative baseline hazard 
function corresponding to -Fo(y|A) given in (8). From Table 2, we observe 
that there is a concave pattern in the LPMLs and there is a convex pattern 
in the DICs as functions of J for the CIS, PHPH and LCRM with fixed G. 
We note that for J = 15, the values of LPML and DIG are -827.9 and 1648.9 
for the Cox model and —838.0 and 1673.5 for the LACR model. Thus, the 
concave (convex) pattern in the LPMLs (DICs) as functions of J still holds 
for the Cox and LCRM models. Similar patterns are also observed in the 
LPMLs and DICs as functions of G for fixed J under the LCRM. Among 
the three J's shown in Table 2, J = 5 consistently fits the data better for 
the CIS, PHPH and LCRM models and J = 10 fits the data better for the 
Cox and LACR models. The LCRM model, with J = 5 and G = 3 fits the 
data best among all models considered. In particular, LPML = —816.0 and 
Die = 1613.7 for the best LCRM model while LPML = -821.5 and DIC 
= 1640.8 for the best Cox model. Except for G = 1, the LCRM model with 
G >2 improves the fit compared to the CIS, PHPH and LACR models. 

The posterior estimates of the parameters under the best LCRM model 
with J = 5 and G = 3 are given in Table 3. We see from this table that 
LogPSA is significant in the proportional hazards model (5) for the survival 
function for a "noncured" subject and LogPSA, G8H and Year of RP are 
significant in the multinomial model (6) for the latent group membership 
at a significance level of 0.05. In addition, Year of RP is nearly significant 
in both models. Although the prior cure rates for the three risk groups are 
0.9, 0.5 and 0.1, respectively, the resulting posterior estimates of these cure 

Table 1 

Summary of covariates for prostate cancer data 



Covariate Coded variable Value Definition Frequency 





LogPSA 


(—00, 00) 


Logarithm of PSA prior 


to RP 






ix2,X3) 


(G7, G8H) 


(0,0) 


Gleason score 6 or less 






866 






(1,0) 


Gleason score 7 






303 






(0,1) 


Gleason score 8-10 






66 


XA 


Cstage 


(Tl) 


Clinical tumor category 


Tic or 


T2a 


1055 






1 (T2) 


Clinical tumor category 


T2b or 


T2c 


180 


X5 


Year 


>0 


Year of RP 
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Table 2 

LPMLs and DICs of Cox, CIS, PHPH, LACK and LCRM models 



Model 


G 


J = 


1 


J 


= 5 


J = 


10 


LPML 


Die 


LPML 


DIG 


LPML 


DIG 


Cox 




-864.0 


1722.3 


-822.3 


1642.4 


-821.5 


1640.8 


CIS 




-827.4 


1651.6 


-821.6 


1641.5 


-822.4 


1643.0 


PHPH 




-831.5 


1655.5 


-824.2 


1642.3 


-825.4 


1646.9 


LACK 




-841.2 


1680.6 


-832.2 


1662.6 


-831.8 


1661.6 


LCRM 


1 


-845.8 


1686.5 


-821.3 


1640.6 


-823.3 


1645.4 




2 


-823.3 


1633.5 


-820.8 


1628.1 


-822.6 


1632.3 




3 


-822.9 


1626.3 


-816.0 


1613.7 


-819.5 


1624.6 




4 


-823.9 


1628.4 


-817.1 


1617.3 


-820.3 


1627.4 




5 


-824.4 


1629.1 


-818.0 


1620.8 


-821.7 


1634.9 



Table 3 

Posterior estimates based on the best LCRM model 





Posterior 


Posterior 


95% HPD 


Variable 


mean 


SD 


interval 


Pi (LogPSA) 


0.349 


0.107 


(0.136, 0.554) 


02 (G7) 


0.117 


0.135 


(-0.141, 0.395) 


03 (G8H) 


0.090 


0.085 


(-0.071, 0.260) 


134 (Cstage) 


0.042 


0.095 


(-0.138, 0.231) 


ft (Year) 


-0.269 


0.143 


(-0.541, 0.016) 


Oi 


0.069 


0.118 


(0.000, 0.301) 


92 


1.193 


0.582 


(0.328, 2.316) 


0s 


2.671 


1.035 


(1.443, 4.490) 


exp{-6i) 


0.939 


0.095 


(0.740, 1.000) 


exp{-62) 


0.347 


0.156 


(0.059, 0.625) 




0.092 


0.052 


(0.000, 0.181) 


(jiio (Intercept) 


0.842 


0.822 


(-0.907, 2.431) 


(LogPSA) 


-1.841 


0.549 


(-2.962, -0.963) 


cj^i2 (G7) 


-1.162 


0.924 


(-3.349, 0.183) 


(pi3 (G8H) 


-1.801 


1.068 


(-4.128, -0.030) 


(t>i4 (Cstage) 


-0.694 


0.554 


(-1.818, 0.181) 


015 (Year) 


0.840 


0.373 


(0.106, 1.597) 


4)20 (Intercept) 


-0.385 


1.152 


(-2.670, 1.827) 


421 (LogPSA) 


-3.123 


0.987 


(-5.160, -1.127) 


Cj}22 (G7) 


1.118 


1.058 


(-0.991, 3.163) 


</>23 (G8H) 


-0.178 


1.662 


(-3.479, 2.878) 


4)24 (Cstage) 


-0.741 


1.266 


(-3.224, 1.628) 


425 (Year) 


1.144 


0.851 


(-0.519, 2.884) 
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rates are 0.939, 0.347 and 0.092. Under the same model setting for Table 

3, the posterior predictive probabilities, p{k\t,Xnew,^new,G) given in (19) 
with Znew =^new, are computed for three sets of baseline covariates x„eu)'s 
for various i's and the results are given in Table 4. Based on the proposed 
classification criterion given in (20), these probabilities clearly indicate that 
a patient with a PSA level of 5, Gleason 6 or less, and tumor stage Tl 
belongs to risk group 1 (low risk group) and a patient with a PSA level of 
30, Gleason 8 to 10, and tumor stage T2 falls into risk group 3 (high risk 
group) no matter whether he had surgery in 1988 or 2001. However, a patient 
with a PSA level of 5, Gleason 7 and tumor stage T2 may be classified into 
risk group 3 (high risk group) if he had surgery in 1988 while a patient with 
the same PSA level, Gleason score and tumor stage will be classified into 
risk group 2 (intermediate risk group) if he had surgery in 2001. From Table 

4, we also see that for each set of baseline covariates, the risk classification 
does not change no matter how long the patient will live if he had surgery 
in 2001 and this is not the case when he had surgery in 1988. In addition, 
the overall cure rates, 

S{Qo\:Sinew,Znew,P,0,<l)) = > exp(-6'fc)— ^ — — , 

are presented in Table 4. It is interesting to see that when (PSA, Gleason, 
Cstage) = (5, <6, Tl), the overall cure rate is much smaller than that given 
9new = 1, when (PSA, Gleason, Cstage) = (5, 7, T2), the overall cure rate is 
greater than that given gnew = 2 if he had surgery in 2001, while the over- 
all cure rate is very similar to the risk group specific cure rate {gnew = 3) 
when (PSA, Gleason, Cstage) = (30, 8-10, T2). Figure 1 shows the esti- 
mated risk group specific PSA recurrence free probabilities corresponding 
to these three sets of covariates and the estimated overall PSA recurrence 
free probability when the year of RP was 2001. From plots (a), (b) and (c), 
we see that three risk group specific probability curves are well separated 
from each other. These plots also show that a wrong classification may lead 
to either over-estimate or under-estimate of the PSA recurrence free prob- 
ability. Thus, the posterior predictive classification is quite important, as a 
correct classification leads to more accurate estimates of the cure rate as 
well as the PSA recurrence free probability. 

We further conducted a sensitivity analysis on the choice of cq and ^oj's- 
Table 5 shows the LPML and DIG values of the LCRM model with G = 3 
for various cq and the prior cure rates exp(0o) = (0.9,0.5,0.1), (0.8, 0.5, 0.2) 
and (0.7, 0.5, 0.3). Both LPML and DIG values are very similar for almost 
all choices of cq. Among all values of cq and exp(0o)i cq = 2.5 and exp(0o) = 
(0.9, 0.5, 0.1) yield the largest LPML and the smallest DIG among all choices 
considered. Although not reported in Table 5, the posterior estimates of 
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Table 4 

Posterior predictive probability based on the best LCRM model 



Year 


PSA 


Gleason 


Stage 


t 


p{k = l\t) 


p{k = 2\t) 


p{k = 3\t) 


Overall cure rate 


1988 


5 


<6 


Tl 





0.692 


0.099 


0.209 


0.705 













U.ol4 


U.UoO 


n 1 m 
U.lUl 












oo 


u.yiu 


U.UOi 


U.Uzy 






5 


7 


T2 





0.057 


0.384 


0.559 


0.234 










5 


0.132 


0.412 


0.456 












oo 


0.238 


0.426 


0.336 






30 


8-10 


T2 





0.001 


0.053 


0.947 


0.098 










5 


0.005 


0.068 


0.928 












CO 


0.008 


0.072 


0.920 




2001 


5 


<6 


Tl 





0.745 


0.241 


0.014 


0.781 










5 


0.770 


0.220 


0.010 












CO 


0.868 


0.130 


0.002 






5 


7 


T2 





0.183 


0.657 


0.160 


0.409 










5 


0.220 


0.653 


0.127 












CO 


0.327 


0.618 


0.055 






30 


8-10 


T2 





0.006 


0.143 


0.851 


0.117 










5 


0.020 


0.166 


0.815 












CO 


0.042 


0.186 


0.772 





the cure rates were also calculated under those choices of cq and the prior 
cure rates. For example, when cq = 2.5, the posterior estimates of the cure 
rates and the corresponding posterior standard deviations are (0.939, 0.347, 
0.092) and (0.095, 0.156, 0.052) for exp(0o) = (0.9,0.5,0.1), (0.936, 0.351, 
0.091) and (0.097, 0.161, 0.051) for exp(0o) = (0.8, 0.5, 0.2), and (0.936, 
0.352, 0.094) and (0.087, 0.163, 0.051) for exp(0o) = (0.7, 0.5, 0.3). Similar 
results are obtained for other choices of cq. These results demonstrate that 
the proposed LCRM model is quite robust to the specification of cq and 
prior cure rates. 

When G is not specified, we used the RJMCMC algorithm given in Kim, 
Xi and Chen (2009). In the RJMCMC algorithm, we took a = 3 for and 
di = 0.5 for (f)i, / = 1, 2, . . . , G — 1. We specified the transition matrix as 
follows: 



/o.o 


1.0 


0.0 


0.0 


0.0 \ 


0.5 


0.0 


0.5 


0.0 


0.0 


0.0 


0.5 


0.0 


0.5 


0.0 


0.0 


0.0 


0.5 


0.0 


0.5 


\0.0 


0.0 


0.0 


1.0 


0.0/ 



The dimension of model, G, is assumed to follow a Poisson distribution 
with mean fiQ = 3 and truncated between 1 and 5. Also J is fixed to 
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g= 


=1 


g= 


= 2 


g= 


= 3 



4 6 
Time (Years) 



(a) 



(b) 




Fig. 1. Plots of the estimated risk group specific PSA recurrence free probabilities cor- 
responding to (PSA, Gleason, Cstage) = (5, <6, Tl) (a), (5, 7, T2) (b), and (30, 8-10, 
T2) (c), and the estimated overall PSA recurrence free probability (d) for year of RP — 
2001. 



be 5. Under the above setting, the posterior probabihties of G are com- 
puted and these are P(G = = 0.0, P{G = 2\Dabs) = 0.224, P{G = 
3\Dobs) = 0.534, P{G = 4\Dobs) = 0.242, and P{G = 5\Dobs) = 0.0, respec- 
tively. Therefore, the model with G = 3 has the highest posterior model 
probability. This result is consistent with the best model identified by the 
LPML and DIG measures shown in Table 2. We also conducted a sensi- 
tivity analysis on the specification of in the prior distribution for G. 
Specifically, we obtained that P{G = l|Do6,) = 0.0, P{G = 2\Dobs) = 0.251, 
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Table 5 

LPMLs and DICs of the LCRM model for various co and prior cure rates 



Prior cure rates 



(0.9, 0.5, 0.1) (0.8, 0.5, 0.2) (0.7, 0.5, 0.3) 



Co 


LPML 


Die 


LPML 


DIG 


LPML 


DIG 


0.5 


-816.9 


1615.4 


-817.7 


1616.5 


-817.8 


1617.2 


1.0 


-816.6 


1614.9 


-817.2 


1616.0 


-817.2 


1616.6 


1.5 


-816.4 


1614.3 


-816.7 


1615.3 


-816.7 


1616.0 


2.0 


-816.2 


1614.1 


-816.4 


1615.0 


-816.6 


1615.5 


2.5 


-815.9 


1613.7 


-816.2 


1614.6 


-816.3 


1615.1 


3.0 


-816.1 


1614.0 


-816.2 


1614.8 


-816.4 


1615.4 


3.5 


-816.1 


1614.4 


-816.3 


1615.1 


-816.6 


1615.8 


4.0 


-816.4 


1614.8 


-816.4 


1615.5 


-817.0 


1616.1 


10.0 


-816.6 


1615.4 


-816.9 


1615.9 


-817.3 


1616.5 



P{G = 3\Dobs) = 0.535, P{G = 4|Dofc,) = 0.214, and P{G = 5\Dobs) = 0.0 for 
fiG = 2, and P{G = l\Dobs) = 0.0, P{G = 2\Dobs) = 0.162, P{G = 3\Dobs) = 
0.536, P{G = MDobs) = 0.302, and P{G = 5\Dobs) = 0.0 for ^iq = 4. Thus, 
the model with G = 3 consistently has the highest posterior model proba- 
bility for all three choices of ftc- 

In all the computations, we first generated 100,000 Gibbs samples with a 
burn-in of 4000 iterations, and we then used 20,000 iterations obtained from 
every 5th iteration for computing all posterior estimates, including posterior 
mean, posterior standard deviation, 95% highest posterior density (HPD) 
intervals and LPML. The computer codes were written in FORTRAN 95 
using IMSL subroutines with double precision accuracy. The convergence of 
the MCMC sampling algorithm was checked using several diagnostic proce- 
dures as recommended by Cowles and Carlin (1996). 

6. Discussions. In Section 5 we used LPML and DIG measures to assess 
the goodness of fit of the models for different choices of G and J. LPML is 
a well-established Bayesian model comparison criterion based on the con- 
ditional predictive ordinate (GPO) statistics, which is particularly suitable 
for the cure rate models. Let GPOj denote the GPO statistic for the ith 
subject. LPML is defined as 

n 

LPML = ^log(GPOi). 

i=l 

The larger the LPML, the better the fit of a given model. Letting 7 denote 
the vector of all model parameters and L{'-f\D(,bs) the likelihood based on 
the observed data Dobs, the DIG is defined as 

DIG = Dev(7) + 2pz), 
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where Dev(7) = — 21ogL(7jL'o{,s) is a deviance function, 7 is the poste- 
rior mean of 7, pm = Dev(7) — Dev(7), and Dev(7) is the posterior mean of 
Dev(7). For the LCRM model, 7 = (/3,0,0,A) and L(7|L»o6,) = 
L{l3,6,cl),X\Dobs), which is given by (10). The DIG is a Bayesian measure 
of predictive model performance, which is decomposed into a measure of fit 
and a measure of model complexity (pd). The smaller the value of DIG, the 
better the model will predict new observations generated in the same way as 
the data. As discussed in Spiegelhalter et al. (2002), DIG is the Bayesian ver- 
sion of the Akaike Information Griterion (AIC) [Akaike (1973)]. Unlike AIG, 
the dimensional penalty in DIG is automatically calculated without actu- 
ally counting the number of parameters. Although the dimensional penalty 
is not explicitly shown in LPML, LPML has a dimensional penalty similar 
to AIG as derived by Gelfand and Dey (1994) based on the asymptotic ap- 
proximation. Moreover, as discussed in Ibrahim, Ghen and Sinha (2001), the 
LPML measure is particularly suitable for comparing cure rate models, as 
the moments do not exist under these models. 

As discussed in Sections 1 and 2, there are several cure rate models for 
survival data with a cure fraction recently developed in the literature. There 
is a distinct difference between the proposed model and the existing ones. 
Specifically, the new model is to no longer explain the cure fractions directly 
according to covariates but to divide the population into latent classes char- 
acterized by specific cure rates and being described according to covariates. 
This nice feature of the proposed model allows us to develop the predictive 
classification algorithm for classifying patients into different risk groups. The 
proposed mixture model falls within the latent class modeling framework. 
The latent class models are commonly used for analyzing complex sample 
survey data. For survey data, a latent class model is often used to explain 
unobservable categorical relationships or latent structures that characterize 
discrete multivariate data [Dayton (1999), Agresti (2002) and Patterson, 
Dayton and Graubard (2002)]. Recently, latent class models have been de- 
veloped for survival data. Lin et al. (2002) proposed latent class models 
for joint longitudinal and survival data. They assumed a Gox proportional 
hazards model with time-varying covariates for the survival endpoint and 
each latent class represents certain pattern of longitudinal and event-time 
responses. Larsen (2004) extended the Gox model to encompass a latent class 
variable (an indicator of the unobserved status of health or functioning) as 
predictor of time-to-event. However, the literature on the latent class model 
for survival data with a cure fraction is still sparse. Based on the subset of 
the data published in D'Amico et al. (2002), we showed in Section 5 that 
the proposed model with three latent cure rate markers fits the data best 
based on LPML, DIG and the reversible jump of Green (1995). This finding 
is consistent with the prostate cancer literature, as the three risk groups are 
routinely used in the prostate cancer clinical practice. 
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Although the proportional hazards (PH) assumption is assumed for the 
cumulative hazard function H{y) for noncured subjects in (5), the result- 
ing survival function is not PH due to the nature of the mixture model. 
To examine the PH assumption, we first considered the generalized odds- 
rate hazards (GORH) model discussed in Banerjee et al. (2007). We then 
compared various GORH models for H{y) based on the LPML and DIG 
measures to see whether a PH model for H{y) is appropriate. The results, 
which are available in Kim, Xi and Ghen (2009), empirically confirm that 
the PH assumption for H{y) may be appropriate for the prostate cancer 
data discussed in Section 1. 

In Section 5, the covariates considered include only PSA, biopsy Gleason 
score, clinical tumor category and year of RP due to the limitation of the 
prostate cancer data we had. However, it will not add much additional com- 
putational difficulty to incorporate more covariates into the proposed model. 
Unlike D'Amico et al. (1998, 2002), the proposed model does not require 
any prespecified cutoff values of the covariates in classifying patients into 
different risk groups. The proposed method is potentially useful in clinical 
applications as it allows doctors to include as many important covariates as 
possible, some of which may be discovered later on due to medical advances, 
for obtaining a more accurate risk classification. 

In the LCRM model, we assume that there are G unknown latent ^g/s. 
Instead of the latent class model, we may assume a mixture of the Dirichlet 
Process (MDP) model discussed in Ibrahim, Chen and Sinha (2001) for the 
cure rate parameters. Specifically, we assign an unknown 6i to each subject 
and then assume a Dirichlet Process prior for 9i. In Section 2.2, we assume 
a piecewise exponential model for the baseline hazard function ho{y). One 
possible extension to this is to assume a gamma process prior for hQ{y), 
which leads to a semiparametric LGRM model. These two extensions of the 
LGRM model are currently under investigation. 



APPENDIX: PROOFS OF THEOREMS 



Proof of Theorem 1. After summing out N and g, we have 
7r(/3, 9, cf), X\Dobs) oc 7r*(/3, 6>, 0, X\Dobs) 
= L{/3,e,ct),X\Dobs) 



r J 1 [- G 



X UOl^-'eM-hOk) vr(0) 



lj=i J U.=i J 
where L{l3,9,(f),X\Dobs) is given by (10). It suffices to show that 



(A.l) 
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It is easy to show that 

L{P,e,ct,,X\Dobs] 



< n ^Gexp|x^/3 + ^(^,,logA,-exp(x:/3)i/o*(yi)|- 



Using condition (iv), we can show 



G 



■K{cf)) dO dcj) < oo 



n^r"'exp(-6fcefc) 
Lfc=l 

due to the constraints, < < ^2 < ■ • • < ^G, and the condition, > and 
6fc > 0, for A; = 1, 2, . . . , G - 1. Let 



7r*{l3,X\Dabs)= n exp<^x^/3 + ^<5ij logAj -exp(x^/3)//o*(yi) 

.{i:Ui = l} I j = l 

In order to estabhsh (A.l), we only need to prove 
(A.2) j ■K*{(3,X\Dohs)d(3d\<oo. 

Consider the transformation Uj =log(Aj), and let u = (ni, . . . , uj)'. Then, 
d\j = Xjduj , J = 1, 2, . . . , J, and 



Tr*{p,n\Dobs)=7T*{f3,X\Dobs 
J 



9(Ai,A2,...,Aj) 



d{ui,U2,. ■ ■ ,uj) 



n n {exp(n,+x:,/3)}^- 

j = l{i:Ui = l} \ 

X exp-^ -Jijexp(x-/3) 



exp{uj){yi - Sj-i) 

i-i 
1=1 



Letting 5ij^ = 1 and 5ij = for j ^ ji , we have 
^*(/3,u|Z)„fe,)<7r**(/3,u|I),6,) 
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= n exp(txj^ + X-/3) exp{-(yi - sj^-i) exp(Mj, + x-/3)}, 

{i:i/i = l} 

and it suffices to show that / tt** {P,u\Dobs) d(3 du < oo. We rewrite '/r**(/3, 
u\Dobs) as 

TT**{(3,u\Dobs) 

J 

= n n exp(uj + X-/3) exp{-(?/i - exp(Mj + x-/3)} 

i = l {i:ui = l,ji=j} 

= n n exp(Mj +x-/3)exp{-(2/i - Sj_i)exp(Mj +X-/3)} 

jVi* {i:Ui = l,ji=j} 

X ]J exp(Mj + X-/3) exp{-(yi - Sj-i) exp(uj + x-/3)}. 

Since > 1, there exists Sj_i < yi - < Sj for j ^ j* . Thus, 

n n exp{uj + X-/3) exp{-(yi - sj-i) exp{uj + x-/3)} 
{i-'^i=i,ji=j} 

<KiY[ exp{uj + x-^/3) exp{-(yi^. - Sj_i) exp(uj + x-^./3)}, 

where > is a constant, and 

/ n n exp(uj + X-/3) exp{-(yj - sj-i) exp{uj + x-/3)} 

/oo 
exp(nj + x-^./3) exp{-(yi^ - Sj-i) exp{uj + x.[^P)}duj 

where K2 > is a constant. For j = j* , without loss of generality, we as- 
sume yi*,...,yi* ^ S {sj*-i, Sj*], and X*,, which has the Ith row (l,x^*), 
i = 1, . . . ,p + 1, is of full rank. Therefore, 

Yl exp(uj. + X-/3) exp{-(yi - Sj*-i) exp('Uj* + x-/3)} 

{i:iyi = l,ji=j*} 
p+l 

< ^3 n exp{u* + x-*/3}exp{-(yi* - Sj*_i) exp(u* + x-./3)}, 
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where K3 > is a constant. Now consider the transformation w = {wi, . . . , 
Wp^i)' = X*, ("^*), which is a one-to-one transformation. We have 



/ ]T^^P("i* +x-„/3)exp{-(?/i* -Sj*_i)exp(uj. + y^^ (3)} duj* d(3 
oc / TT exp(tt;;)exp{-(yi* - Sj.__i)exp(i(;;)}(ii(;i 



p+l 
1=1 



Therefore, 

< 00, 

where > is a constant. This completes the proof. □ 

Proof of Theorem 2. It is sufficient to show that P{gnew = ^, </>, 
X^t^'Sf-newi'^newiG) for k = \ {k = G) increases (decreases) in t. We can 
rewrite the conditional predictive probability in (18) as 

^ exp(z;e^(?!>fc) 

E[Liexp(z'„,^0,)exp(-(0, -0fc)[l -exp{-exp(x;,„/3)i?o*(i)}])' 

Since Hq (t) is an increasing function of t, 6i — 9i> for / > 1, and 9i — 9g < 
for I < G. Thus, p{k\t,:x.new,'^new,G) increases in t for A; = 1 and decreases 
in t for A: = G. □ 
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SUPPLEMENTARY MATERIAL 

Checking the proportional hazards assumption and computational de- 
velopment (DOI: 10.1214/08-AOAS238SUPP; .pdf). In online supplemen- 
tary material we provide the empirical results for checking the proportional 
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hazards assumption and the description of the Markov chain Monte Carlo 
(MCMC) sampling algorithm for a fixed G and the detailed development of 
the reversible jump MCMC. 
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